Enriching Webpages with Semantic Information
Abstract
This paper proposes a tool that automatically enriches webpages with semantic information by annotating keywords in the document with microdata markup. Two case studies are described and implemented: the first generates new webpages with microdata, and the second enriches existing webpages with microdata. The paper also demonstrates the practicality of using schema.org terms to construct a referenced ontology. Finally, a comparative study shows that the proposed tool is more reliable, in terms of performance and advanced features, than other existing automatic microdata generator tools.
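The keyword annotation described in the abstract can be illustrated with a minimal Python sketch. This is not the paper's tool: the function name `annotate`, the `KEYWORD_TYPES` mapping, and the plain substring matching are all assumptions made for illustration, and the paper's actual ontology and matching strategy are not shown in the abstract. The sketch only shows what wrapping a keyword in schema.org microdata (`itemscope`, `itemtype`, `itemprop`) might look like.

```python
import html
import re

# Hypothetical keyword-to-type mapping; the values are schema.org type names.
# The paper's real ontology is not given in the abstract.
KEYWORD_TYPES = {
    "Tim Berners-Lee": "Person",
    "W3C": "Organization",
}


def annotate(text, keyword_types):
    """Return HTML in which each known keyword is wrapped in microdata markup."""
    if not keyword_types:
        return html.escape(text)
    # Try longer keywords first so that an overlapping shorter keyword
    # does not win over a longer one at the same position.
    ordered = sorted(keyword_types, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(k) for k in ordered))
    parts = []
    pos = 0
    for match in pattern.finditer(text):
        # Escape the untouched text between keywords.
        parts.append(html.escape(text[pos:match.start()]))
        keyword = match.group(0)
        parts.append(
            '<span itemscope itemtype="https://schema.org/{}">'
            '<span itemprop="name">{}</span></span>'.format(
                keyword_types[keyword], html.escape(keyword)
            )
        )
        pos = match.end()
    parts.append(html.escape(text[pos:]))
    return "".join(parts)


if __name__ == "__main__":
    print(annotate("W3C was founded by Tim Berners-Lee.", KEYWORD_TYPES))
```

The sketch matches keywords as plain substrings and ignores word boundaries and existing HTML structure, so enriching an existing webpage (the second case study) would need an HTML parser rather than string matching.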
Related resources
A Novel Architecture for Detecting Phishing Webpages using Cost-based Feature Selection
Phishing is one of the luring techniques used to exploit personal information. A phishing webpage detection system (PWDS) extracts features to determine whether it is a phishing webpage or not. Selecting appropriate features improves the performance of PWDS. Performance criteria are detection accuracy and system response time. The major time consumed by PWDS arises from feature extraction that ...
Extracting Semantic Networks among Named Entities from Websites
To enable machine processing of webpages, it is important to identify the relationships among named entities. Named entities, such as people, organizations, and places, are important pieces of information that must be extracted. The scale of the web means that manual extraction is not feasible. We propose a system that automatically constructs a semantic network of named entities from webpages...
Improving the Compression Efficiency for News Web Service Using Semantic Relations Among Webpages
Both compression and decompression play important roles in a web service system. A high compression ratio helps save storage, while fast decompression helps reduce the service's response time. Focusing specifically on news web services, this paper proposes a compression mechanism to improve the efficiency of compression and decompression simultaneously by taking advantage ...
Knowledge extraction from webpages
This article presents a system to extract knowledge from webpages by producing semantic annotations. Taking into account semantic information from the domain to annotate an element in a webpage implies solving two problems: (1) identifying the syntactic structure of this element in the webpage and (2) identifying the most specific concept (in terms of subsumption) of the ontology that will be ...
Keyword Extraction for Webpage Clusters
The volume of unstructured information on the Internet is constantly increasing, together with the total number of websites and their contents. To process this vast amount of information, it is important to distinguish different clusters of related webpages. Such clusters are used, for example, for template induction, keyword extraction, and recommendation algorithms. A variety of appl...